Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition
Authors
Abstract
Designing good normalisation to counter the effect of environmental distortions is one of the major challenges for automatic speech recognition (ASR). The vector Taylor series (VTS) method is a powerful and mathematically well-principled technique that can be applied in both the feature and model domains to compensate for both additive and convolutional noise. One limitation of this approach, however, is that it is tied to MFCC (and log-filterbank) features and does not extend to other representations such as PLP, PNCC and phase-based front-ends, which use a power transformation rather than log compression. This paper broadens the scope of the VTS method by deriving a new formulation that assumes a power transformation is used as the non-linearity during feature extraction. It is shown that conventional VTS, in the log domain, is a special case of the new extended framework. In addition, the new formulation introduces one more degree of freedom, which makes it possible to tune the algorithm to better fit the data to the statistical requirements of the ASR back-end. On the Aurora-4 tasks, the proposed approach provides average absolute performance improvements of up to 12.2% over MFCC and up to 2.0% over conventional VTS.
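The relationship between log compression and power compression mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's derivation: it assumes the commonly used environment model Y = X*H + N in the power-spectral domain (X clean speech, H channel, N additive noise) and a Box-Cox style compression (x^alpha - 1)/alpha, whose limit as alpha approaches 0 is log(x). All function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def gen_log(x, alpha):
    """Box-Cox style compression (assumed form, not taken from the paper).

    Returns (x**alpha - 1) / alpha, with log(x) used for alpha == 0,
    which is the limit of the expression as alpha -> 0.
    """
    x = np.asarray(x, dtype=float)
    if alpha == 0.0:
        return np.log(x)
    return (x ** alpha - 1.0) / alpha

def noisy_log_domain(x_log, n_log, h_log=0.0):
    """Log-domain mismatch function for Y = X*H + N.

    y = x + h + log(1 + exp(n - x - h)), where x, h, n are logs of X, H, N.
    """
    return x_log + h_log + np.log1p(np.exp(n_log - x_log - h_log))

def noisy_power_domain(x_pow, n_pow, h_pow, alpha):
    """Same model Y = X*H + N, written for features of the form X**alpha.

    With x_pow = X**alpha, h_pow = H**alpha, n_pow = N**alpha:
    Y**alpha = ((x_pow * h_pow)**(1/alpha) + n_pow**(1/alpha))**alpha.
    """
    inv = 1.0 / alpha
    return ((x_pow * h_pow) ** inv + n_pow ** inv) ** alpha

# Illustrative values (assumptions): X = 4, H = 0.5, N = 1, so Y = 3.
X, H, N = 4.0, 0.5, 1.0
Y = X * H + N

y_log = noisy_log_domain(np.log(X), np.log(N), np.log(H))
alpha = 0.1
y_pow = noisy_power_domain(X ** alpha, N ** alpha, H ** alpha, alpha)

# Both mismatch functions describe the same noisy observation Y.
log_ok = np.isclose(y_log, np.log(Y))
pow_ok = np.isclose(y_pow, Y ** alpha)

# The compressed value approaches log(Y) as alpha shrinks towards 0.
limit_gap = abs(gen_log(Y, 1e-6) - np.log(Y))
```

Under these assumptions, the exponent alpha plays the role of the extra degree of freedom: small values behave like log compression, while larger values give a milder, root-like compression.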
Similar resources
Channel Compensation in the Generalised Vector Taylor Series Approach to Robust ASR
Vector Taylor Series (VTS) is a powerful technique for robust ASR but, in its standard form, it can only be applied to log-filter bank and MFCC features. In earlier work, we presented a generalised VTS (gVTS) that extends the applicability of VTS to front-ends which employ a power transformation non-linearity. gVTS was shown to provide performance improvements in both clean and additive noise c...
Improving the performance of MFCC for Persian robust speech recognition
The Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
A new algorithm for robust speech recognition: the delta vector taylor series approach
In this paper we present a new model-based compensation technique called Delta Vector Taylor Series (DVTS). This new technique is an extension and improvement over the Vector Taylor Series (VTS) approach [7] that addresses several of its limitations. In particular, we present a new statistical representation for the distribution of clean speech feature vectors based on a weighted vector codeboo...
A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions
In this paper, we present our recent development of a model-domain environment-robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using multi-sources of information including a nonlinear environment distortion model in the cepstral domai...
Speech recognition in noisy environments using first-order vector Taylor series
In this paper, we generalize relations between clean and noisy speech signal using vector Taylor series (VTS) expansion for noise-robust speech recognition. We use it for both the noisy data compensation and hidden Markov model (HMM) parameter adaptation, and apply it for the cepstral domain directly, while Moreno used it to estimate the log-spectral parameters. Also, we develop a detailed ...